Learning theory analysis for association rules and sequential event prediction

Authors

  • Cynthia Rudin
  • Benjamin Letham
  • David Madigan
Abstract

We present a theoretical analysis for prediction algorithms based on association rules. As part of this analysis, we introduce a problem for which rules are particularly natural, called “sequential event prediction.” In sequential event prediction, events in a sequence are revealed one by one, and the goal is to determine which event will next be revealed. The training set is a collection of past sequences of events. An example application is to predict which item will next be placed into a customer’s online shopping cart, given his/her past purchases. In the context of this problem, algorithms based on association rules have distinct advantages over classical statistical and machine learning methods: they look at correlations based on subsets of co-occurring past events (items a and b imply item c), they can be applied to the sequential event prediction problem in a natural way, they can potentially handle the “cold start” problem where the training set is small, and they yield interpretable predictions. In this work, we present two algorithms that incorporate association rules. These algorithms can be used both for sequential event prediction and for supervised classification, and they are simple enough that they can possibly be understood by users, customers, patients, managers, etc. We provide generalization guarantees on these algorithms based on algorithmic stability analysis from statistical learning theory. We include a discussion of the strict minimum support threshold often used in association rule mining, and introduce an “adjusted confidence” measure that provides a weaker minimum support condition that has advantages over the strict minimum support. The paper brings together ideas from statistical learning theory, association rule mining and Bayesian analysis.
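The abstract does not state the formula for the "adjusted confidence" measure. The sketch below assumes it takes the form #(a and b) / (#a + K), where # counts training sequences containing the itemset and K >= 0 is a tuning constant, so that K = 0 recovers ordinary confidence. The function names, the toy data, and the simplification of treating each sequence as an unordered set are all illustrative assumptions, not the authors' algorithms.

```python
from itertools import combinations


def count(itemset, baskets):
    """Number of training baskets that contain every item in `itemset`."""
    return sum(1 for basket in baskets if itemset <= basket)


def adjusted_confidence(lhs, rhs, baskets, k=1.0):
    """Assumed form: #(lhs and rhs) / (#lhs + k); k = 0 is ordinary confidence."""
    denom = count(lhs, baskets) + k
    if denom == 0:
        return 0.0  # rule never observed and no smoothing: score it as zero
    return count(lhs | {rhs}, baskets) / denom


def predict_next(current, baskets, k=1.0, max_lhs_size=2):
    """Return the (item, score) of the best-scoring rule lhs -> item,
    where lhs is any subset of the current basket up to `max_lhs_size`."""
    current = frozenset(current)
    candidates = set().union(*baskets) - current
    best_item, best_score = None, -1.0
    for size in range(max_lhs_size + 1):
        for lhs in combinations(sorted(current), size):
            lhs = frozenset(lhs)
            for item in sorted(candidates):
                score = adjusted_confidence(lhs, item, baskets, k)
                if score > best_score:
                    best_item, best_score = item, score
    return best_item, best_score


# Hypothetical training data: each past shopping cart as a set of items.
baskets = [
    frozenset({"bread", "butter", "jam"}),
    frozenset({"bread", "butter", "jam"}),
    frozenset({"bread", "milk"}),
    frozenset({"caviar", "blini"}),
]
```

With this toy data, the rule caviar -> blini is seen only once, so its ordinary confidence (K = 0) is as high as it can be, while K = 1 shrinks it. This illustrates how a soft penalty on rare rules can stand in for a hard minimum support cutoff, under the assumed formula.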

Related resources

A Learning Theory Framework for Association Rules and Sequential Events

We present a framework and generalization analysis for the use of association rules in the setting of supervised learning. We are specifically interested in a sequential event prediction problem where data are revealed one by one, and the goal is to determine what will next be revealed. In the context of this problem, algorithms based on association rules have a distinct advantage over classica...

Inter-Transaction Association Rules Mining for Rare Events Prediction

Rare events prediction is a very interesting and critical issue that has been approached within various contexts by research areas, such as statistics and machine learning. Data mining has provided a set of tools to treat this problem when the size as well as the inherent features of the data, such as noise, randomness and special data types, become an issue for the traditional methods. Transac...

Sequential Event Prediction with Association Rules

We consider a supervised learning problem in which data are revealed sequentially and the goal is to determine what will next be revealed. In the context of this problem, algorithms based on association rules have a distinct advantage over classical statistical and machine learning methods; however, there has not previously been a theoretical foundation established for using association rules i...

Forecasting of Events by Tweet Data Mining

This paper describes the analysis of quantitative characteristics of frequent sets and association rules in the posts of Twitter microblogs related to different event discussions. For the analysis, we used a theory of frequent sets, association rules and a theory of formal concept analysis. We revealed the frequent sets and association rules which characterize the semantic relations between the...

Using Partially-Ordered Sequential Rules to Generate More Accurate Sequence Prediction

Predicting the next element(s) of a sequence is a research problem with wide applications such as stock market prediction, consumer product recommendation, and web link recommendation. To address this problem, an effective approach is to mine sequential rules from a set of training sequences to then use these rules to make predictions for new sequences. In this paper, we improve on this approac...

Journal title:
  • Journal of Machine Learning Research

Volume 14

Publication date 2013